Astronomy & Astrophysics manuscript no. wslensing 


© ESO 2009 


April 22, 2009 





Combining weak and strong cluster lensing: Applications to 

simulations and MS 2137 

Julian Merten 1 , Marcello Cacciato 2 , Massimo Meneghetti 3 ' 4 , Claudia Mignone 1 ' 2 , and Matthias Bartelmann 1 

1 Institut fur Theoretische Astrophysik, Zentrum fur Astronomie der Universita Heidelberg, Albert-Ueberle-Str.2, 69120 Heidelberg, 
Germany 

e-mail: jmerten@ita.uni-heidelberg.de 

2 Max-Planck-Institut fur Astronomie, Konigstuhl 17, 69117 Heidelberg, Germany 

3 INAF-Osservatorio Astronomico di Bologna, Via Ranzani 1, 40127 Bologna, Italy 

4 INFN-National Institute for Nuclear Physics, Sezione di Bologna, Viale B. Pichat 6/2, 40127 Bologna, Italy 
Accepted for publication in A&A on March 28, 2009. 

ABSTRACT 

Aims. While weak lensing cannot resolve cluster cores and strong lensing is almost insensitive to density profiles outside the scale 
radius, combinations of both effects promise to constrain density profiles of galaxy clusters well, and thus to allow testing of the CDM 
expectation on dark-matter halo density profiles. 

Methods. We develop an algorithm further that we had recently proposed for this purpose. It recovers a lensing potential optimally 
reproducing observations of both strong and weak-lensing effects by combining high resolution in cluster cores with the larger- 
scale information from weak lensing. The main extensions concern the accommodation of mild non-linearity in inner iterations, the 
progressive increase in resolution in outer iterations, and the introduction of a suitable regularisation term. The linearity of the method 
is essentially preserved. 

Results. We demonstrate the success of the algorithm with both idealised and realistic simulated data, showing that the simulated 
lensing mass distribution and its density profile are well reproduced. We then apply it to weak and strong lensing data of the cluster 
MS 2137 and obtain a parameter- free solution which is in good qualitative agreement with earlier parametric studies. 

Key words. Gravitational lensing - Galaxies: clusters: general - Galaxies: clusters: individual: MS 2137 - Cosmology: theory 



1. Introduction 

The mass distribution in dark-matter halos and the level of sub- 
structure in them are among the central predictions of the CDM 
paradigm for cosmic structure formation. The density profile 
should asymptotically fall off oc r~ 3 a t large radii r and flatten 
considerably within a radial scale r s (Nav arro et al.||1996] ).The 
mass distribution should be richly substructured by sublumps 
of matter with a differential mass function approximated by a 
power law, dn/dM oc M a with a slope slighly shallower than 
a = -2 ( |Madau et al.]2008l[Springel et al.|2008] ). 

Galaxy clusters should be weakly influenced by baryonic 
physics, thus their density profiles and mass distributions out- 
side the cooling radius should well reflect those expected for 
dark matter. Do they? Although tentative answers exist, show- 
ing that estimated density profiles do at least not contradict the 
CDM expectation, accurate constraints are still missing. Due to 
its insensitivity to the physical state of the matter, gravitational 
lensing is perhaps the most promising tool for determining mat- 
ter distributions. 

Weak lensing lacks the resolution necessary to constrain the 
density profile in cluster centres, while strong lensing is confined 
to the innermost cluster cores. In combination, they may be able 
to test the CDM predictions on density profiles well. 

Several methods h ave been suggested to combine weak and 
strong cluster lens ing ( Bradac et al. 2005} Cacciato et al.|2 006 ; 
Die go et al.||2007| ). Among them is our own algorithm aiming 
at the lensing potential. It is based on minimising a^ 2 function 
comparing observed shear measurements with suitable second 



derivatives of the potential. Expressing the derivatives in terms 
of finite differences leads to a system of linear equations whose 
direct inversion yields the solution. 

We extend our earlier work in several ways. First, we no 
longer use the lowest-order approximation in which measured 
ellipticities estimate the shear, and introduce the reduced shear 
instead. The non-linearity accommodated in this way can be re- 
solved into an iterative scheme using linear inversion in each 
step. Second, we wrap the algorithm into an outer iteration loop 
in which the grid resolution is progressively enhanced. While 
this step introduces correlations between adjacent pixels that 
have to be dealt with, it prepares the insertion of the strong- 
lensing constraints available in cluster cores. Third, we introduce 
a regularisation term for the two purposes of avoiding overfit- 
ting and smoothly joining the strong- and weak-lensing solu- 
tions. Finally, to account for the additional computational time 
we enabled the code to run on parallel machines. 

We investigate the performance of our algorithm using two 
sets of synthetic data, one idealised and one realistic, before 
we proceed to apply it to the well-known strong-lensing cluster 
MS 2137, for which we obtain a high-resolution, parameter- free 
reconstruction. 

A brief summary of the lensing notation in Sect. [2] is fol- 
lowed by an outline of the method in Sect. [3] and a description 
of its implementation in Sect. [4] We present the results in Sect. [5] 
and conclude in Sect. [6] Details of the algorithm are given in 
Appendix [A| 
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2. Lensing formalism 

2.1. Basic quantities 

We adopt the standard notation introduced to describe iso- 
lated lenses in the thin-lens approximation (e.g. IP. S chneider 
T992| [Narayan & Bartelmannl[T^96| [P. Schneider||2006| ). Two- 
dimensional, projected lensing mass distributions are covered 
by angular coordinates 6 = (61,62). The lensing potential ifs(0), 
which is the appropriately scaled Newtonian potential projected 
on the sky, contains all information necessary to describe a 
single-plane lens. The deflection angle, convergence and shear 
are derivatives of xf/iO) with respect to 6\ and 62, 



a = V<A 

1 2 l(d 2 d 2 ) 1 
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2.2. Lensing by galaxy clusters 

We concentrate on lensing by galaxy clusters. Let us first fo- 
cus on weak lensing, which means k <*c 1 (see Bartelma nn &| 
Schneider (2001 ) for a review). To first order, shape distortions 
of background galaxies are determined by the Jacobian matrix 
of the lens mapping, 



&(0) = \Sij- 
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The intrinsic ellipticities of the background galaxies require that 
their images be averaged to extract the weak-lensing signal from 
them. We assume that averages over ten or more galaxies are 
necessary for the uncertainty of an individual ellipticity mea- 
surement to fall below 10% of the signal ( Cacciato et al.|2006 ). 
The expectation value for the measured ellipticity is then 



(s) 
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where the reduced shear 



8(0) = 



7(6) 

\-K(0) 



and the distance weight function 



Z(z)=^^H(z-z d ) 



(7) 



(8) 



appear. In the last equation, Zd is the redshift of the lens, while 
Doo and D doo are the angular-diameter distances between ob- 
server and infinity and between lens and infinity, respectively. 

While the probes of weak lensing are slightly distorted back- 
ground galaxies whose signal needs to be treated statistically, 
strong lensing is based on greater effects. Another difference 
from weak lensing is that strong lensing only occurs in galaxy 
clusters near their cores where the lens becomes critical. These 
regions are typically of the order of 100 kpc in radius. The main 
observations are 



- multiple images of background sources, which all carry the 
same spectral information of the source, which enables their 
unambiguous identification through a spectral or colour anal- 
ysis, and 

- highly distorted images of background sources like gravi- 
tational arcs or arclets, which lie close to critical curves of 
clusters. 

Critical curves are closed point sets in the lens plane where 
the Jacobian becomes singular, 

detJ?l crit = (1 - * crit ) 2 - Ircritl 2 = . (9) 

3. Outline of the method 

Our non-parametric maximum-likelihood reconstruction 
method aims at recovering the lensing potential i//. The reasons 
for this choice are that the lensing potential is much smoother 
than e.g. the convergence, which renders it much less suscepti- 
ble to noise, and that both convergence and shear are derivatives 
of the lensing potential so that no integration is needed to 
convert one to the other. The method described and applied here 
develops fu r ther and extends those presented in Bartelmann 
[eTaLl ( [T996l ); |Seitz et alJp9^ ; |CaccikoetaLl ( |2006> . 

The method takes as input the result of galaxy- shape and 
strong-lensing measurements, i.e. the two ellipticity parameters 
per galaxy and the strongly lensed images at their angular posi- 
tions. As we pointed out before, an ellipticity measurement of a 
single galaxy image is useless as a weak lensing signal because 
of the intrinsic source ellipticity. Each data point is thus obtained 
by averaging over a certain number of background galaxy ellip- 
ticities. 

We then divide the observed galaxy-cluster field into a grid 
of N cells, assign an averaged ellipticity to each cell, and 
thus obtain N data points for our x 2 -minimisation. The ensu- 
ing reconstruction will strongly depend on the grid resolution. 
Furthermore, if a number of M grid cells, which we shall call 
pixels from now on, contains strongly lensed images, we gain M 
additional constraints for the reconstruction. 

Since the weak and strong-lensing constraints are indepen- 
dent of each other, but reflect the same underlying gravitational 
potential, the overall x 2 becomes the sum of two independent 
contributions, 

x*W)=xiM+xiW. do) 

Our method is non-parametric in the sense that it does not as- 
sume a parameterised model for the mass or potential distribu- 
tion. It assigns an initially unknown potential value to each grid 
point and refines the set of potential values on the grid during 
the x 2 -minimisation. We are thus searching for a discrete repre- 
sentation of the lensing potential, which is optimally capable of 
reproducing the observed lensing effects. 

The reconstruction proceeds by minimising x 2 with respect 
to the potential values if/i at all grid positions /, 







(11) 



dif/i dif/i dif/i 

The main advantage of the maximum-likelihood approach is its 
enormous flexibility. In principle, one can incorporate every ad- 
ditional observable constraint that can be connected in some way 
to the lensing potential. This is of course not restricted to lensing. 
One simply has to add separate and independent x 2 -functions 
and minimise their sum with respect to the discrete potential 
values. Even if we are only using weak and strong lensing con- 
straints for now, future improvements of our method should in- 
clude as many of these constraints as possible. 
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3.1. Resolution issues 
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Fig. 1. Tb/? panel: Very coarse grid of 10 x 10 pixels. An ex- 
ample for a pixel with ten galaxies included is shown in blue. 
The irregular distribution of galaxies with several void areas is 
clearly seen. This problem is fixed in the bottom panel, showing 
a grid of 20 x 20 pixels. The circles show the adaptive averaging 
scales for each individual pixel, which causes overlap, illustrated 
in blue. 



Before we can start computing the ^-functions for our joint 
lensing reconstruction, we have to address two issues concern- 
ing the resolution of our reconstruction grid. The first is related 
to the weak-lensing regime. If we want to average over at least 
ten galaxies per pixel, the typical background-galaxy density in 
the field of a cluster would not allow a higher resolution than 
~ 10 x 10 pixels, which is of course way too coarse to see any 
cluster substructures. In addition, pixels of a homogeneous grid 
can occur which contain fewer than 10 or even no galaxies be- 
cause of the inhomogeneous, random galaxy distribution. 



We solve these problems by an adaptive averaging proce- 
dure, in which we average galaxy ellipticities within circles 
around each pixel centre. Their radii are stepwise increased un- 
til each circle contains the desired number of galaxies. Different 
pixels will need different radii, depending on the local galaxy 
density in that area of the field. On a fine grid, galaxies shared 
by neighbouring pixels will of course cause these pixels to be 
correlated (see Fig[T]). 

The second issue concerns the strong-lensing regime, in par- 
ticular the arc positions. Since strong lensing is confined to much 
smaller scales than weak lensing, essential positional informa- 
tion is lost if the strong-lensing constraints are incorporated at 
the same resolution as weak lensing. This requires us to refine 
the grid near cluster centres until it is capable of resolving the 
exact arc positions (see Fig. [2]). 



Fig. 2. Zoom into the inner part of the cluster. The left panel 
illustrates that a coarse, 20 x 20 grid covering the whole field is 
by far not able to resolve arc positions. The right panel shows a 
100 x 100 pixel resolution with respect to the whole field, which 
is able to follow the arc positions. 



3.2. Defining maximum likelihood functions 

The most importan t ing redient of our cluster reconstruction is 
the x 2 -function (Eq. 10 ) that we need to minimise. Consider first 
the weak-lensing term. As discussed before, the weak-lensing 
grid pixels are correlated because of the adaptive-averaging pro- 
cedure, and the expectation value of the ellipticity is the reduced 
shear rather than the shear. Thus we find, for \g\ < 1, 



Z(z)yW 
l-Z(zM^) 



\-Z{ Z )K{llf)j 



(12) 

where (s) represents the results of the averaging process for each 
pixel. We are using Einstein's sum convention here and below. 
The case \g\ > 1 is not relevant in our reconstruction because it 
only affects at most very few pixels on the reconstruction grid. 
Eq. T2| illustrates the first major improvements in our method 
since |Cacciato et aL] ( |2006| ). First, we introduce the reduced 
shear instead of the shear as a reconstruction constraint. 
Furthermore, we introduce the adaptive averaging technique 
which adapts to the actual galaxy distribution of the field. Thus, 
the error in the ^-function is quantified by the non-diagonal co- 
variance matrix dj, which we evaluate now. The standard de- 
viation o~ i for each weak lensing pixel is obtained during the 
averaging process as the standard deviation from the mean. This 
standard deviation has three contributions assumed independent, 



CT = 0~int + CTmeas + 0~t 



sys 



(13) 
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which are the noise due to the intrinsic ellipticity cr int , noise in- 
troduced by measurement uncertainties <T mea s and a systematic 
noise term <x sys which arises from the fact that the galaxies over 
which we average cover spatial ranges in which the properties 
of the lens may change. Here one can also see that the radii of 
the averaging circles should not become too large, otherwise the 
coherence in the lensing signal tends to be lost. 

Starting from the definition of the covariance matrix, 



- (Xi))(Xj - (Xj))) , 



(14) 



where X[ is the ellipticity sample of the pixel with index i and us- 
ing that the correlation between two pixels due to the averaging 
process will be proportional to the overlap between the averag- 
ing circles attached to them as shown in Fig.[T} we arrive at 



dj = Wijo-iO-j 



(15) 



The weight factors are obtained from the number of galaxies 
contained in the overlap area of both circles, 



Nl + Nj 



(16) 



where Nj and Nj are the galaxy numbers contained in the circles 
around pixel centres i and j, and is the number of galax- 
ies contained in their overlap. These weights have the expected 
properties, i.e. they are unity if i = j and vanish for completely 
independent and uncorrected pixels. 

The strong-lensing term looks much simpler. By definition, 
the determinant of the Jacobian vanishes on the critical line. 
Thus, if we know which pixels are traversed by the critical curve, 
we can define 



(det mmfi ( (1 - z fe)«W) 2 - l z feW)l 2 )' 



(17) 



where the strong-lensing error estimate cr s is mainly caused by 
the finite pixel size of our grid, since this determines the inac- 
curacy of the position of the critical curve. We approximate this 
uncertainty to first o rder with the help of the Einstein angle (see 
|Cacciato et al.|2006| ), 



80 



60 ■• 



66 



(18) 



with the pixel size 66. This expression holds exactly for an 
isothermal sphere, but can be used as a good approximation for 
the noise of the critical-curve position. 

To evaluate Eq. ( [TT] ), we have to connect convergence and 
shear to the grid values of the potential, which we shall do in the 
next section. 



3.3. Lensing potential and convergence maps 

The lensing potential which we are reconstructing is not directly 
observable, but linear combinations of its second derivatives are. 
Therefore, we can always add a constant, a linear function in 6, 
or a harmonic function to it without changing the observables. 
Furthermore, if ellipticities are the only quantities measured, the 
lensin g potential is affec ted by the so-called mass-sheet degen- 
eracy ( |Falco et aLp 985 ), which arises because ellipticities are 
invariant against isotropic scaling of the Jacobian matrix. Note 
that such transformations also leave the critical curves of a lens 



unchanged. These degrees of freedom allow th e following trans- 
formation of the potential (Brada c et al.|2004 ): 



mz)^^ f (6,z) 



I- A 



-0 + Aifs(0,z) ■ 



(19) 



where X ± is an otherwise arbitrary constant. 

This is the reason why the reconstructed, discretised poten- 
tial may look shifted or distorted. However, this is not a problem 
because we only need its curvature. We obtain physically mean- 
ingful quantities like convergence or shear by simply applying 
Eqs. Q, ^ and ^ to if/. We shall use the convergence mainly 
to describe the reconstruction of a galaxy cluster, because it intu- 
itively reflects the cluster's mass distribution through its surface 
mass density. 



Due to the mass-sheet degeneracy ( [Falco et aL] |T985), the 
convergence is unique only up to the transformations 



k(0, z) -> k'(0, z) = (1 - X) + MO, z) 



(20) 



with A ± 0. If our observed field is sufficiently large, we assume 
that k — > towards the field boundary, so we can use Eq. ( [20} 
again for normalisation. 

More elaborate methods require observables in the recon- 
struction which are not invariant under the mass-sheet transfor- 
mation and depend on potential and convergence. One example 
is the source magnification, which Bartelm ann et al.| ( [l996| ) sug- 
gested to include in the maximum-likelihood approach. Another 
approach to lift the mass-sheet degeneracy was proposed by 
Bra dac et al. ( 2004 2005 ), who proposed to exploit the knowl- 
edge of the source-redshift distribution. 

Having obtained the convergence, possibly transformed 
according to the mass-sheet degeneracy, mass estimates are 
straightforward. If we know the lens redshift and fix the cosmo- 
logical model, we also know the physical area of one pixel. If we 
additionally know at least the mean redshift of the sources, we 
can calculate the surface mass density, which yields an estimate 
for the total cluster mass after summing over the whole grid. 
Recall, however, that this returns a distance-weighted integral 
over the entire mass of cosmic structures along the line-of-sight 
from the observer to the sources. 



4. Implementation 

We shall now proceed to the specific implementation and the de- 
scription of the required numerical methods and algorithms. As 
we already pointed out we significantly developed our method 
with the introduction of an adaptive averaging scheme and the 
use of the reduced shear instead of the shear. The price that 
we pay for these improvements is correlated reconstruction pix- 
els and a relatively complicated two-level iteration scheme that 
we will describe in this section. As a result the runtime of our 
method increased dramatically which made it necessary to in- 
crease the speed of the reconstruction algorithm. The most im- 
portant step towards speeding up the calculations is the paralli- 
sation of our code, using the well-known MPI library 



4.1. Preparing weak lensing data 

We start with the analysis of the weak lensing data. It is provided 
in the form of a table containing columns for the position and 
ellipticity measurement for each distorted background galaxy. 
It should be noted that the coordinates are arbitrary as long as 
they all refer to the same coordinate system. We express the co- 
ordinates in arcseconds relative to the brightest cluster galaxy 
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(BCG). The reconstruction grid is set up by assigning coordi- 
nates to each pixel centre. 

The adaptive averaging process proceeds by enlarging cir- 
cles around each pixel centre until they contain a pre-assigned, 
constant number of background galaxies. Once this number of 
galaxies is reached, the average and the standard deviation of 
the ellipticity are calculated and assigned to the pixel. 

The covariance matrix between two pixels is determined by 
the number of galaxies shared between them. Its final entries 

because we can now calculate the 
). This procedure has to be done for 



are obtained using Eq. ( [15 
weightings Wy (see Eq. 



16 



both ellipticity components, and the resulting covariance matri- 
ces must be inverted. 



4.2. Preparing strong lensing data 

Handling the strong-lensing data, we have to cope with the fact 
that we cannot observe the critical curves directly. We thus need 
a good approximation for their locations, which is given by arc 
positions that can be observed very well. We show in Fig.[3]that 
arcs follow the position of the critical curves, as long as the res- 
olution of the grid is not extremely high. However, even at the 
higher resolution of the finely resolved central grid, the differ- 
ence in the pixel positions between ar cs and critical curves is at 
most two pixels. [Cacciato et aT]p006| ) showed that deviations of 
this size do not affect the reconstruction significantly. 

A more severe problem is that the arcs sample the critical 
curves only very sparsely. We cannot expect to obtain full knowl- 
edge of the critical curve through observations. It will be one 
aim of future work to use high-resolution observations of clus- 
ter fields which tend to show more strongly lensed images and 
thus allow tracing of the critical curve in more detail. Another 
possibility would be to rely on critical-curve reconstructions 
from parametric strong-lensing analyses, and to feed that criti- 
cal curve into the code. The drawback of this approach is that 
one gives up the completely non-parametric nature of the recon- 
struction by using profile assumptions in the critical-curve deter- 
mination. In either case, the critical curves are characterised by 
a table listing the approximate positions of critical points. 
We finally point out that we are using arcs as approximate indica- 
tors for critical-curve locations rather than multiple-image sys- 
tems as strong-lensing constraints. This is for several reasons, 
first of all the identification of multiple image systems is not 
always possible since it requires multi-color observations and 
the subtraction of cluster members to look for hidden images. 
Moreover, due to resolution issues the reconstruction cannot be 
extremely accurate in the cluster centre and as a result we do not 
expect large changes by using multiple images instead of critical 
points. 

But nevertheless one should use as many constraints as possible, 
which is the reason why future versions of our method will also 
contain multiple-image system information, if available, to give 
an optimal reconstruction result. 



4.3. Grid methods 

We now proceed to combine lensing theory and the measure- 
ments, and to implement the method num erica lly. 
We defined the ^-functions in Sect. 



3.2 



using the discre- 
tised lensing-potential values as minimisation parameters. The 
observable to be reproduced in the case of weak lensing is the 
reduced shear, and the location of a vanishing determinant of the 
Jacobian in the case of strong lensing. 




Fig. 3. Arc position and estimated position of the critical curves 
for the real cluster MS 2137. The critical curves wer e estimated 
by a parametric strong lensing reconstruction from Comerford 
|et al.| ( [20Q6] ). Left panel shows 32x32 grid resolution where one 
can see no deviation. Right panel shows 75x75 where one sees 
that the deviation in position is still small. 



In order to minimise the resulting x 2 -function with respect 
to the lensing potential, we have to write the convergence and 
the shear in terms of the discrete potential values. The relation 
between these quantities is given by Eqs. ([2]), ^ and ^ with 
the second derivatives replaced by finite-differencing schemes. 

Given a certain grid resolution, we want to obtain the lensing 
potential at each grid position (x, y) with \ < x < M,\ <y < N. 
We enumerate these grid positions sequentially line by line and 
use the central difference quotients for discrete representations 
of the second derivatives at a given grid position . Our finite dif- 
ferencing scheme is identical to that chosen by |Cacciato et al. 
( 2006 ), and the corresponding coefficients are given in Fig. |4j 
At the edges and borders of the grid different (one-sided) finite- 
differencing schemes need to be used. 

With our way of enumerating the pixels, we can write the 
discrete potential iff on the N x M grid in the form of a vector 
iff e R^ M . This only slightly complicates addressing the correct 
positions in the vector. Direct left and right neighbours are just 
separated by ±1 positions in the vector, while top and bottom 
neighbours are separated by ±N positions. The advantage of this 
enumeration is that the finite differencing becomes a simple ma- 
trix multiplication, 



Ki 



7t = Gftfkj 



(21) 
(22) 
(23) 



Here, 7C ZJ , Q\., Q 1 . are sparse band matrices encoding the in- 
formation on the finite-differencing scheme. The fact that we 
know these matrices perfectly is the key point to increase the 
speed of our reconstruction algorithms, which plays an impor- 
tant role with respect to runtime, besides the parallelisation of 
the code. Without several numerical tricks the runtime of a com- 
plete reconstruction would not stay at an acceptable level. 

Using these finite-differencing schemes, x 1 can be min- 
imised by solving the linear system of equations 



Bu&k = <Vi 



(24) 



where the coefficient matrix and the result vector *V\ contain 
on the one hand observed ellipticity and critical curve informa- 
tion and on the other hand the convergence and shear matrices. 
The calculation is detailed in the Appendix. 
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Fig. 4. Finite differences schemes. Written in the cells are the coefficients in the difference quotients. Left panel: Scheme for the 
convergence; centre panel, scheme for the first shear component; right panel, scheme for the second shear component. 



4.4. Regularisation and 2-step-iteration 
AAA. Regularisation 

We now introduce a regularisation term Riif/) into the;^ 2 function 
Eq. (TTOb to obtain 



(25) 



The regularisation term depends on the potential and is defined 
such as to disfavour unwanted solutions. This is necessary to pre- 
vent the reconstruction from following intrinsic noise patterns, 
which do not reflect intrinsic features of the true underlying po- 
tential. Here, we use a simple regularisation term scheme simply 
comparing the convergence obtained with that found in the pre- 
ceding iteration, 



(26) 



Its amplitude r/i controls the relative strength of the regulari- 
sation and is crucial for the reconstruction. It should be cho- 
sen such that the overall x 2 is of order unity and of course 
low enough for changes in the reconstruction to take effect. 
Minimising the regularised^ 2 of Eq. (26) again leads to a linear 
system which contributes one additional term to the coefficient 
matrix and the result vector. The complete calculation is pre- 
sented in the Appendix. In our reconstruction we also regularize 
on the two shear components which give similar terms. Note that 
it is not useful to directly regularize the potential since it looks 
very different in each iteration step, as described in Sect. [33 



4.4.2. Inner-level iteration 

As the full expressions for the x 2 minimisation show, it is not 
precisely a linear system, but includes non-linear prefactor terms 
in the coefficient matrix and the result vector . We solve this prob- 
lem in the same way as Bradac et al.| (f2005) did by means of an 
iterative approach in which the non-linear terms are computed 
from the preceding iteration. Starting from a first guess for the 
convergence, we express the corresponding non-linear terms by 
constant factors, as can be seen in the Appendix. 

Bradac et al. (2005 ) showed that the initial guess of the con- 
vergence does not affect the final reconstruction, but at most the 
number of iterations. We confirm their result, which implies that 



the initial guess of a flat convergence is appropriate. We min- 
imise x 2 an d obtain a solution for the lensing potential. From 
it, we calculate convergence and shear, which we insert as new 
guesses into the non-linear terms in the next step of the recon- 
struction. This yields a convergent iterative process. We control 
the convergence of the procedure by comparing its results be- 
tween subsequent iterations. If the change falls below a given 
threshold, we stop the iteration. The drawback of this method is 
that we now need several iterations (3-5 in practice) for a com- 
plete reconstruction, which takes some time at high resolution. 

4.4.3. Outer-level iteration 

The regularisation term also helps in a second type of iteration, 
which we call outer-level iteration. The background-galaxy den- 
sity of today's observations allows a resolution of only ~ 10 x 10 
uncorrected pixels. Higher resolution is desirable at the expense 
of correlated pixels, which will affect the reconstruction. Also 
the convergence values in some individual pixels will increase 
at higher resolution, which renders the initial guess of a vanish- 
ing convergence increasingly less accurate, giving rise to many 
inner-level iterations. 

These problems can easily be avoided by introducing another 
iteration as follows: 



- One begins at the lowest possible resolution, where the pixels 
are not or almost not correlated. This resolution will be too 
coarse for fine- structure noise patterns to appear. 

- Starting from the initial convergence set to zero, the inner- 
level iteration is performed until the reconstruction con- 
verges. 

- The obtained result for the potential is then interpolated by a 
bicubic spline algorithm to a slightly higher resolution. 

- This interpolation is taken as the comparison function in the 
regularisation and as a new initial guess in the inner-level 
iteration at the higher resolution. 

- This process is repeated until the final resolution is reached. 

This two-level iteration delivers by far the best reconstruc- 
tion results, but with the disadvantage of increased CPU time 
due to the higher number of iterations. The result of the gradual 
transition from low to high resolution can be seen in Fig. [5] 
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Fig. 5. Convergence maps of a simulated cluster. Left panel. 
Initial reconstruction starting at a resolution of 15 x 15 pixels. 
Right panel: Final resolution of 32 x 32 pixels, reached after 
several iterations. 



because no weak-lensing data points are available at the high res- 
olution required in the core. The strong-lensing data points, how- 
ever, are available at high resolution, because the critical points 
can be determined very accurately. Yet, the weak-lensing recon- 
struction enters even at this resolution level through the regu- 
larisation term. Thus, the result is based on the weak-lensing 
constraints. 

We finish the reconstruction by inserting the high-resolution 
cluster-core result into the result obtained at a coarser resolution, 
which consists of the complete cluster field. Due to the regulari- 
sation function the strong-lensing result fits nicely into the weak- 
lensing results since we do not allow that the two different recon- 
structions differ significantly at pixels where no strong-lensing 
constraints are availa ble. This is also the las t major change in our 
method compared to |Cacciato et al.| ([2006 ). The use of the reg- 
ularisation function as a tool to match results on different scales 
improves the quality of our reconstructions significantly. 



4.5. Strong lensing at high resolution 
4.5.1. Interpolation 

Even when a weak-lensing reconstruction as described in the 
previous sections has converged, the grid resolution is far from 
being fine enough to follow the shape of the critical curves. We 
deal with the problem by interpolating the lensing potential from 
the final weak-lensing reconstruction and calculate convergence 
and shear at that refined resolution. Then, we zoom into the clus- 
ter core where the strong-lensing information is available. The 
interpolation is done by a bicubic spline interpolation routine 
which is reliable enough to create just a small amount of addi- 
tional noise. The interpolated result serves as template for the 
following high-resolution reconstruction. 




Fig. 6. Left panel: Complete combined weak and strong lens- 
ing reconstruction at low resolution of a simulated cluster. Right 
panel: Zoom into the interpolated cluster core of the low- 
resolution reconstruction. 



5. Results 

5.1. Tests with synthetic data 



We first repeat the tests also carried out in |Cacciato et al.| (2006). 
We take simulated clusters from the N-body simulations de- 
scribed in Bartelma nn et al. ( 1998] ) and compute maps of their 
reduced shear and their critical curves. Based on this informa- 
tion, we try to reconstruct the known potential of the simulated 
cluster. Since this is an idealised lensing scenario which does not 
include a realistic background-galaxy distribution or image anal- 
ysis, it is sufficient to compare the convergence map obtained by 
the reconstruction with the real convergence map of the simu- 
lated cluster. The results confirm the reliability of our method 
and are shown in Figs. [7] and [8] In particular, Fig. [8] shows how 
significantly the results are improved when the strong-lensing 
constraints on a refined grid are added. Otherwise the central 
density peak is underestimated by almost 20%. 





Fig. 7. Left panel: Convergence map of the high-resolution re- 
construction. As one can see, the resolution is much higher in 
the inner part of the map. The colour scale is logarithmic and 
the contours start at k = 0.1 spaced by Ak = 0.08. Right panel: 
The convergence map of the original cluster rebinned at the res- 
olution of the reconstruction. The colour scale and the contour 
levels are identical in both panels. 



4.5.2. Modified maximum likelihood function 

Now, we have to remove the weak-lensing term from the x 2 
function, 

^ffl=xtM + RW), (27) 



5.2. The lensing simulator 

Next, we use the method detailed in Meneghett i et al.| ([2008) to 
simulate an observation of another cluster field. The target of 
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Fig. 8. The radial k profile along the main diagonal of the critical 
cluster field. 



suit is shown in Fig. [TT] and shows a good agreement with the 
real cluster mass on all scales. Note that we plot both reconstruc- 
tions here, which are practically indistinguishable. 




Full critical curve 
Knot positions 
Critical curve estimator 



this simulated observati on is the galaxy cluster g72, described in 
several previous papers ( |Dolag et al.|2005[|"Puchwein et al.|2 005 ; 
Meneghetti et al. 2007 ). This cluster was simulated with different 
physics. For the present work we have used a pure dark matter 
simulation. The cluster is at redshift z c = 0.297 and has a main 
halo mass of M200 = 6.7x10 14 M o //l It is in the process of merg- 
ing with a massive substructure of mass M200 ~ 3 x 10 14 M© /h. In 
the projection chosen for this simulation the subclump is located 
at ~ 150 kpc/h north of the main clump. A second massive sub- 
structure is present at a distance of ~ 2.5 Mpc/h from the cluster 



centre. The projected density of the cluster is shown in Fig. 10 
We mimic a 2500" x 2500" SUBARU observation of this clus 
ter and of the sky behind it in the R-band assuming an exposure 
time of 6000s and an isotropic, Gaussian PSF of 0.6" FWHM. 
The distortion field, which is used to lens the background galax- 
ies, is calculated fro m the cluster mass distribu tion following the 
method described in Meneghetti et al. (2007 ). The background 
galaxies have realistic morphologies, being drawn from shapelet 
decompositions of real galaxies taken from the Hubble-Ultra- 
Deep Field (HUDF). Their luminosity and redshift distributions 
also reflect those of the HUDF JCoe et al.|2006] ). 
The weak lensing analysis of the field was carried out by F. 
Bellagamba (Univ. of Bologna) using an advanced KSB method 
( [Kaiser et al.|1995| ). It returned an ellipticity catalogue of 39788 
background sources. For the strong lensing analysis we followed 
two different approaches. First, we used the known complete 
critical curves to obtain a result under optimal conditions. In 
a second reconstruction we choose a kind of worst-case sce- 
nario where we used four estimated points of the critical curve. 
These point were determined following an observational ap- 
proach. First, we identified in the simulation a few fold arcs. 
Then, we isolated the brightest knots along the lensed images, 
which are used to constrain the position of the critical lines pass- 
ing in between them. 

A first qualitative look at the obtained convergence map al- 
ready shows a very nice agreement in orientation, shape and sub- 
structure of the real cluster and the reconstruction. See Fig. [10 



where we show the convergence map of the reconstruction which 
uses only four critical-curve points. In addition we performed 
a more quantiative analysis, by reconstructing the total mass 
within a certain radius. Here we used the average redshift of the 
sources and assumed that the mean convergence vanishes at the 
borders of the field to correct the mass-sheet degeneracy. The re- 



Fig. 9. This figure illustrates how the critical curve estimators are 
obtained from given multiple lensed images of the simulation. In 
the bottom-right corner of the plot we show the pixel size of the 
final, finely resolved iteration. 




0.01 



OJO 



R [Mpc/h] 



Fig. 11. Top panel: The real and the reconstructed total mass of 
the simulated cluster within a certain radius. Bottom panel: The 
residuals from the real mass at different scales. Both panels show 
both reconstructions, based on the full critical-curve knowledge 
and on only a few points of the critical curve respectively. 



5.3. MS 2137 

We finally apply our reconstruction algorithm to the galaxy 
cluster MS 2137. It is a cluster at redshift z c = 0.313, dominated 
by a bright cD galaxy. It shows a giant tangentia l arc, a rad ial arc 
which was the first to be discovered (see |Fort et al.||1992| ), and 
three additional arclets. The spectroscopic redshifts of the arcs 
were determined to be z\ = 1.501 and zi = 1.502 ( [Sand et aL] 
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Fig. 10. Left panel. Real convergence map of the simulated cluster. Right panel. Convergence map of the high-resolution recon- 
struction. The colour scale in both maps is linear and the sidelength corresponds to ~ 10 Mpc. 



2002 ). The tangential arc and two of the arclets belong to the 
same source. The radial arc also produces one counte r image. 
The cluster has been studied several times before, se e |Gayazzi 
W^^M^^GdiY2iZz\\^n^ \ \ComQxfoxd et al.| ( |2006l ) and|Sand 
et al.| p008] ). All these studies used parametric reconstruction 
routines differing from our method. 



The weak-lensing analysis is based on an VLT/FORS obser- 
vation obtained during August 2001. Its results were kindly pro- 
vided by R. Gavazzi (IAP, Paris). The field size is 410 " x 410 ". 
The ellipticity ca talogue was obtained with a KSB -method 
( [Kaiser et al.|1995| ) and returned 1500 ellipticity measurements 
on background galaxies. The arc positions were obtained from 
an HST/WFPC2 exposure using the F702W filter. During the 
reconstruction, the measured arc redshifts were used. Based on 
the experience from the tests with simulated data, we averaged 
over 15 galaxies per reconstruction pixel. The low-resolution re- 
construction was performed on a 25 x 25 pixel grid which was 
gradually refined to 40 x 40 pixels for a reconstruction which re- 
solves the arcs better (see Fig. [12]). Unfortunately, there is a large 
void in the background-galaxy data in the upper middle part of 
the field. In this area, the reconstruction is thus unable to resolve 
any structures. 



the strong lensing regime. The discrepancy between those works 
is discussed in Donnarumm a et al.| (|2009). 



A comparison with former reconstructions is shown in Fig. 13 



In the weak-lensing regime we have a close agreement with the 
Gavazzi ( 2005| ) results, which is expected, since we are using the 
same weak lensing data. In the strong-lensing regime we are in 
good agreem ent with the latest reconstruc tions of [Donnarumma] 
et al.| ( |2009| ) and |Comerford et al.N2006| ), but it tends to prefer 
a lower central mass than Gavazzi (2005). The reason for this is 
still unclear, but the reconstruc tions of |Comerford et al.| ( |2006| ) 
and Donnarum ma et al. ( [2009 ) seem to prefer a lower mass in 




Fig. 12. High resolution reconstruction of MS 2137 on a refined 
40 x 40 pixel grid. The side length corresponds to 1.8 Mpc. The 
contours start at k - 0. 1 and are spaced by Ak = 0. 1 8 
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Fig. 13. Comparison of our results with other reconstructions. 
The plot shows the reconstructed mass within a certain radius. 
Also indidcated is the transition between the weak and the strong 
lensing regime by means of the radius, in which all multiple im- 
ages of the cluster are contained. 



6. Conclusions 



We h ave extended the algorithm proposed by |Cacciato et al. 
( 2006 ) which reconstructs the lensing potential of galaxy clus- 
ters on a two-level grid, combining weak and strong-lensing 
data. Our extensions concern three aspects: First, the shear is re- 
placed by the reduced shear to improve the reconstruction in the 
mildly non-linear regime. The non-linearity introduced in this 
way is solved by an inner iteration loop which preserves the lin- 
ear minimisation of the x 2 function. Second, the spatial resolu- 
tion of the potential grid is gradually increased in an outer it- 
eration loop. While this step causes correlations between neigh- 
bouring pixels, which need to be dealt with using a non-diagonal 
covariance matrix, it prepares for the introduction of a highly re- 
solved grid covering the strong-lensing observations in the clus- 
ter core. Third, we add a regularisation term to the x 2 function, 
avoiding overfitting of noise and allowing a smooth transition 
from the inner, high-resolution to the outer, coarse-resolution 
grid. 

These extensions of the algorithm, especially the introduction of 
the full x 2 -function and the increased number of iterations made 
it necessary to drastically increase the numerical performance of 
the reconstruction routines. This was done with fast, numerical 
multiplication schemes and the parallelisation of the code to run 
on MPI machines. Applications of this algorithm to synthetic 
data show that the known, simulated cluster lenses are accu- 
rately reproduced, although intrinsic ellipticities, measurement 
and shot noise inevitably lead to a considerable noise level of the 
reconstruction in the case of realistic data. Still, the quantitative 
analysis of our convergence maps shows a very good agreement 
with the expected result. An application to the galaxy cluster 
MS 2137 qualitatively confirms the results of earlier, paramet- 
ric studies that show an almost axially symmetric, presumably 
relaxed mass distribution. Although there is still room for po- 
tentially important further improvements, our algorithm should 
now be ready for reliable recoveries of lensing potentials, den- 
sity distributions and density profiles of real galaxy clusters. 
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Appendix A: Linearisation in grid space 



The matrix representation of the finite differences (Eqs. [2T1 [22 



and 



23 ) allows a simplification of the^ minimisation. For weak 



lensing, it reads 



2 _ 2 2 
X\N ~~ X\ X2 



(A.1) 



containing one term each for the two ellipticity components. 

Of course, we have to perform the minimisation for both 
components, but we show only the calculation for one com- 
ponent. Correspondingly, the quantities s,y,C,Q represent 
s l , y l , C 1 , Q l and s 2 , y 2 , C 2 , Q 1 , respectively. One x 2 contribu- 
tion is 



z *r* \ r -i( r z jyj 

l-ZMH\ ej -l-ZjKj 
C7} 



(l-ZiKdil-ZjKj) 



(Si(l - Z t Ki) - Z^i) - ZjKj) - Zjj^j 



= Tij [(Si - SiZiKi - ZijiXsj - SjZjKj - Zjyj)] 

— j^z^y &i& jZjKj S\Zjyj S[S jZ[Ki ~\~ E(E jZ{Z j^i^ j 
^EiZiZjKiJj - SjZiJi + SjZiZjKjJi + ZiZjJiJj] , 

(A.2) 

where we combined the non-linear factors (1 - Zk) in the matrix 
pref actor Tij. We deal w ith this term by the iterative approach 
described in Sect. 4.4.2 In each iteration, this term is simply 
kept constant. 

Now, we minimise this equation with respect to the potential 
values xp 7, 

dx 2 W ! 







(A.3) 



The derivative of x 2 with respect to 07 is 



dX 2 W 



-s^Zj^KjW-SiZj^yjW 

-EiEjZ~Ki(^/) 

a a 

+ £i£jZiZjKi—Kj(lf/) + EiSjZiZjKj—Kiilfr) 

o a 

+ SiZiZjKi—yjiiff) + EiZiZjjj—Ki^/) 

a a 
~ £jZiZj a^ l yi(l//) + £ j ZiZ J K j'M l ^ l//) 

^EjZiZjJ — Kj^) + ZiZjJ — Jj^) 



(A.4) 
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Using ji = ffikiffk, Ki = %k^k and ^%k^k = %khu we can and the result vector 
replace the remaining derivatives, 



= Tij [-SiSjZjftjkSki - SiZjQjkSki - SiSjZfKikhi 

+ £ i £ j Z i Z j c Kik^k c Kjkhi + +£iSjZ i Z j c Kj k il/k%khi 
+ EiZiZfKik^kQjAi + £iZiZj3j k if/ k % k d k i 
- £jZiZjQ ik 5ki + SjZiZfKjktykGik&ki 
J tSjZ i ZjQ i kil/kKjk^ki + ZiZjQiktykGjk&ki 
+ZiZjQ jk^kQikhi\ • 

(A.5) 



4(det^), 



ZfK a • 



(A.13) 



The last equation shows that we can now write Eq. |A.3 
linear system of equations, 



as a 



with the coefficient matrix 

But = Tij [siSjZiZ/K^Kj, + SiSjZiZ/K/Ku + s i Z i Z j % ik g jl 
+ £iZiZj@jk%i + SjZiZjKjkQn + SiZiZjQikKji 
+ZiZjQik@ji + ZiZjQjkQu 



(A.7) 

and the result vector 

*Vz = Tij [siSjKjt + SiZjffji + eiSjZiKu + SjZ&u] . (A.8) 

We now repeat this exercise for the strong lensing term, 
where 

(det^D? {{\-ZiKif-\Ziji\ 2 ) 2 



Xs = 



(A.9) 



Again, we isolate the non-linear terms and take them as constant 
during each iteration step, 



dxtWk) ^ 2(det^), d 



cr. 



2(det^) f 

2(detJ?l) ( - 
cr 2 



—(l-Z iKi (i/, k ) 2 --\Z i7i (i/, k )\ 2 



2(1 - ZiKitykM-Zi—Kityk) 



2(detJ?()i 



[2(1 - ZwtykMrZtKu) ~ IZiyumZig, 



-2Z m ^k)ZiQ\ 
4(detJ?l) ( 



[zfKz'Ka'I'k ~ ZfKu - Z 2 g] k g}^ k - ZfK a 

y/ i 

-zfg 2 k g 2 ^ k ] . 



This yields another linear system, 

K*k = % , 

with the coefficient matrix for strong lensing 



(A. 10) 
(A. 11) 



Also the regularisation term in Eq. ( 25 1 has to be minimised, 



= 2r,i(^-K i )\-^- f K ik ^ k 
= 2i7i(«f-W t )(-^" a ) 



(A. 14) 



which contributes one additional term to the coefficient matrix 
(A.6) anc i tne resu it vector, 



and 



(A. 15) 



(A. 16) 



Finally, we collect the results to obtain the solution for 
Eq.[TT] given in terms of a linear system with the following co- 
efficient matrix 



ijZiZj 



■ [s}s)<K ik <K jl + e}e)'Kj k 'Ku + e\% ik Q\ + s\g) k <K u 

+s'/K jk §\ l + e)§\ k K fl + gj k g l fl + g) k g l a ] 



7 7 



■ [s 2 s 2 <K ik <K jl + s 2 E 2 <K jk <K a + ^Kg^, + e 2 Q 2 k <K u 



v 4(det^) m ^ 2 

+ Zj ^2 Z ™ 



• [^Gnt^CnZ Q\ni0ml @mk@ml\ 



(A. 17) 



and the result vector 



hj 

4(det^) w 



(A.18) 



Z m < \K m i 



n lk 



4(det#) / 



m m 
+ ^ T] n /^K n i , 

where /, j, n indicate the summation over the complete grid, and 



as ^2^2x / a 1 o\ m over those pixels which are assumed to be traversed by a crit- 

z i {% k % l -g ik g il -g ik g il ) (A.12) icalcurve _ 
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